










Slagle's program SAINT (Symbolic Automatic INTegrator),
which was finished in 1961, is currently the only published
general integration program. It is unquestionably a tour de
force in recursive programming, in the use of heuristics,
and in the simulation of human behavior by a computer.
However, many people who have read the description of SAINT
[...] has [...] which fall outside the range of the
algorithms.

We have experimented with a program which has nine
algorithms and have found it to be able to solve all but two
of the problems that SAINT solved. It was able to solve one
of the two problems attempted but not solved by SAINT. It is
also capable of solving many problems not attempted by
SAINT. We do not believe that the converse is true to any
appreciable extent. The program runs at speeds which are
frequently two to three orders of magnitude faster than
SAINT even though much of the program is still uncompiled
[...] half the running time. We have also [...] on a
heuristic for integration [...] remarkably powerful. The
heuristic is: guess [...] the solution, differentiate the
form, and [...] an extension of [...]; it is recursive [...]
generate subproblems which are progressively [...]. [...]
this heuristic, for instance, we can solve [...] our
algorithmic program was unable to solve [...]. It is our
hope that with heuristics [...] shall be able to get the
solution to problems [...] difficult. We consider integrals
[...] integration tables to be fairly simple and obtainable
[...] relative ease with a small number of algorithms which
[...]

The trouble with the algorithms for integration is that
they usually deal with classes of expressions which are
unique in that each member of these classes is integrable.
Since many (probably most) problems are not integrable in
closed form, there are likely to be few such classes, and
indeed few have been found. Hence the value of a purely
algorithmic program hinges on its ability to handle very
large, but not very complicated, problems. This is useful
but, we feel, neither very interesting nor difficult.

The truly difficult problems in integration are the
ones whose integrability is really in doubt. In general
there can be no decision procedure for integration. This was
recently proved by Richardson. We have found a simple
decision procedure for the case R(x)exp(P(x)) where R is
rational and P is a polynomial. It is reputed that there
exists a decision procedure for integrands which are
algebraic functions (these include rational functions and
roots of rational functions). This is very surprising, if
true. One must resort to heuristics in order to be able to
solve efficiently the problems that remain. We might try to
differentiate all possible well formed expressions and match
these with the integrand. This is clearly inefficient, and
may take forever. One heuristic which we have found (which
was motivated by discussions with Professor M. Minsky) is
the Edge heuristic mentioned above. It is surprising how
well it performs by itself. With the addition of a few
algorithms such as factoring it should be able to solve all
problems which are integrable by the algorithms and many
others.

The rule which governs the use of the heuristic is as
follows: determine the most complicated subexpression in
the integrand, guess at a form of the integral whose
derivative contains this subexpression, then determine the
value of the variables in the form by making further
guesses. There is theoretical evidence in the works of
Liouville that this is a very reasonable approach. We shall
describe the guessing rules (as far as we know them at
present) and give some examples below.


Below, a, b shall stand for undetermined functions of x.
The expression involving g(x) is the most complicated
subexpression in the integrand, and f(x) is the rest of the
integrand. We assume that the integrand is a product of
terms.

Form of the integrand        Form of the integral

f(x)(g(x))^n,  n > 0         a(g(x))^[...] + b
f(x)(g(x))^n,  n < 0         a(g(x))^(n+1) + b
f(x)exp(g(x))                a exp(g(x)) + b
f(x)sin(g(x))                a cos(g(x)) + b
f(x)cos(g(x))                a sin(g(x)) + b
f(x)log(g(x))                a log(g(x)) + b

When these forms fail to yield an answer, then the
integral may contain logarithmic terms (i.e. constants times
functions of log, arctan, or arcsin).

f(x)g(x)                     a log(b + c g(x)) + d

One should switch to the arctan or arcsin formulations
when encountering complex constants in the above.


An example of the use of the Edge heuristic.

     x^4 (1 - x^2)^(-5/2)

Try a(1 - x^2)^(-3/2) + b
Differentiation yields

     -3/2 a(1 - x^2)^(-5/2) (-2x) + a'(1 - x^2)^(-3/2) + b'
          = x^4 (1 - x^2)^(-5/2)

Try a = x^4/(3x) = 1/3 x^3, hence a' = x^2
and b' = -x^2 (1 - x^2)^(-3/2)

In order to solve for b try

     b = c(1 - x^2)^(-1/2) + d

     -1/2 c(1 - x^2)^(-3/2) (-2x) + c'(1 - x^2)^(-1/2) + d'
          = -x^2 (1 - x^2)^(-3/2)

Try c = -x^2/x = -x

Hence d' = (1 - x^2)^(-1/2)

The solution for d is arcsin(x), which is obtained from
the logarithmic term -i log(ix + (1 - x^2)^(1/2)).

The final answer is

     1/3 x^3 (1 - x^2)^(-3/2) - x(1 - x^2)^(-1/2) + arcsin(x)
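
The result of the example can be checked by the same device
the heuristic relies on, namely differentiating the proposed
answer and comparing it with the integrand. The following is
only a numerical sketch of that check; the function names and
the finite-difference step are our own assumptions and are
not part of the program described here.

```python
import math

def integrand(x):
    # x^4 (1 - x^2)^(-5/2), the integrand of the worked example
    return x**4 * (1 - x**2) ** -2.5

def answer(x):
    # 1/3 x^3 (1 - x^2)^(-3/2) - x (1 - x^2)^(-1/2) + arcsin(x)
    u = 1 - x**2
    return x**3 / (3 * u**1.5) - x / math.sqrt(u) + math.asin(x)

def derivative(f, x, h=1e-5):
    # Central difference approximation to f'(x); h is an assumed step size.
    return (f(x + h) - f(x - h)) / (2 * h)

def max_relative_error(points=(0.1, 0.3, 0.5, 0.7)):
    # Largest relative gap between d(answer)/dx and the integrand
    # over a few sample points inside (-1, 1).
    return max(
        abs(derivative(answer, x) - integrand(x)) / integrand(x)
        for x in points
    )
```

A small value of max_relative_error() is consistent with the
answer above; it is of course no proof.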

We must explain some of the steps above which are not
obvious. In this problem the 'hardest subexpression' is
(1 - x^2)^(-5/2). This 'hardest subexpression' is chosen from
terms in the integrand (assuming that the integrand is a
product) which can not appear in the derivative of the other
terms. If a logarithm appears only once in the integrand,
the term containing it is a good candidate for the most
[...]

There are two criteria by which one determines when to
terminate the use of a guess made by the Edge heuristic.
When one has generated the same subgoal as the original
problem, then clearly one must quit the current guess. When
a subproblem is generated which is a constant multiple
(different from one) of the original problem, then
transposition is used to obtain the answer. When the
subproblems generated by the heuristic tend to grow, one
should give up the current guess, although this is a
situation which is not so clear cut.

There are some cases with which [...] encounter
difficulties. In the case of rational functions, the
denominator must be factored before the heuristic has a
chance of succeeding. Otherwise, there is no simple way to
make a guess at the 'most complicated subexpression'. A
similar difficulty occurs in the case of rational functions
of trigonometric functions. The algorithms clearly can take
over in these cases.

Another source of difficulty occurs in those problems
whose solution contains more than one logarithmic term. We
know of no simple way to break up the problem into
subproblems which yield the logarithmic [...] reason why the
case of algebraic functions [...] difficult.

The real source of difficulty to an integration
program, and to the Edge heuristic in particular, is due to
the fact that the character of the integral may not be what
it seems because some transformation has taken place which
masks it. Suppose g(x) is equivalent to zero, and we are
asked to integrate it. Unless we realize the equivalence, we
may have great difficulties in obtaining the results. This
appears to be the real reason why integration is, in
general, recursively unsolvable. Every expression which is
integrable is equivalent to an expression which is trivially
integrable, namely the unsimplified derivative of its
integral. The difficulty with integration can be said to be
due to the transformation which has taken place in the
derivative. A single program can only counteract a finite
number of these transformations, but their number, in some
sense, is infinite.

[...] the algorithmic integration program.

The matching program - SCHATCHEN 


There is a set of routines in SAINT which Slagle did
not describe at all. These routines comprise a recursive
matching program called Elinst (ELementary INSTance). In
about two pages of LISP code Slagle wrote a very powerful
routine which is so recursive that in months of examining
and using it we have not been able to figure it out
completely. It is so powerful that in the six years since it
was written no previously documented attempt at a matching
program has come close to equaling its power, not Formula
Algol, nor Korsvold's simplification program, nor the FAMOUS
system of Fenichel. Most unfortunately, Slagle badly misused
this routine. He was faced with enormous difficulties in
fitting his program in core and thus many of his subroutines
made use of the power of Elinst in order to gain space at
the expense of time. Probably a gain of an order of
magnitude in speed could have been attained by avoiding
Elinst in some matching situations and by using special
purpose matches as we have done.

We have written a matching program called SCHATCHEN
(Yiddish for match-maker) which is a take-off on Elinst. It
is about as powerful as Elinst but is certainly not as [...]

SCHATCHEN is a function of two arguments - an
expression and a pattern. Its purpose is to determine
whether or not there exist values for the variables in the
pattern for which the pattern becomes equivalent to the
expression. Its value is either NIL or a dictionary of [...]
variables. It is very similar in purpose to [...] COMIT but
is specifically designed for algebraic expressions.

SCHATCHEN assumes that the operators PLUS and TIMES are
commutative operators with variable number of arguments. It
[...] identities involving 0 and 1 for PLUS, TIMES and EXPT.
It is this fact which gives it and Elinst much of their
power over other matching programs [...] bothered by missing
operators.

Below we present a somewhat fictionalized [...] of the
program and the patterns that it accepts.

If [...] equals the pattern, the match succeeds.

If the pattern is of the form (VAR a s args) and the
expression is e, then [...] is evaluated. If the value [...]
is false, the match fails. Otherwise, the match succeeds and
((a . e)) is appended to the dictionary.





If the pattern is of the form (op p1 p2 ... pn q) where
op is either PLUS or TIMES, then the following takes place.
If the expression e is not of the form (op e1 e2 ... ek)
then it is converted to (op e).

For each subpattern pi a search is made of the
expression for a subexpression ej which will match it. If
one is found, the match continues with the next subpattern.
If none is found, a match is attempted between pi and 0 if
op=PLUS or 1 if op=TIMES. If the attempt succeeds the match
continues with the next subpattern. Otherwise the match
fails.

If after all the pi have been matched, the expression
still contains some terms, an attempt will be made to match
q with them. Hence q serves the function of the $ in a COMIT
rule. If no terms remain, then q must match 0 if op=PLUS or
1 if op=TIMES.

If op=PLUS and if one of the pi is of the form (TIMES*
r1 r2 ... rm (VAR a s args)) then what is desired is that a
should become the coefficient of (TIMES r1 r2 ... rm) in
the sum. This is done by looping over the remainder of the
expression and matching (TIMES r1 r2 ... rm (VAR a s args))
with each summand in it. The value of the match is the sum
of the individual matches.

If the pattern is of the form (EXPT p1 p2) and the
expression e is of the form (EXPT e1 e2), then p1 must match
e1 and p2 must match e2, or p1 must match e and p2 must
match 1. If e is 1, then p2 must match 0 or p1 must match 1.
If e is 0, then p1 must match 0.

If the pattern is of the form (op p1 p2 ... pk), and
the expression is of the form (op e1 e2 ... ek), then each
pi must match ei.

All other matches fail.
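
The rules above can be illustrated by a much reduced sketch.
It is not SCHATCHEN: expressions are represented here as
nested Python tuples, the names match, match_commutative and
free are our own, the last subpattern of a PLUS or TIMES
pattern is taken to play the role of q, there is no
backtracking, and the TIMES* coefficient loop and the EXPT
rules are left out. The rule that a TIMES pattern matches 0
by matching its last subpattern with 0 is an assumption made
so that a missing linear term yields a zero coefficient. We
also assume that FREE succeeds when its argument does not
contain x.

```python
IDENTITY = {"PLUS": 0, "TIMES": 1}  # identity element of each operator

def free(expr, var="x"):
    """Assumed reading of FREE: true when expr does not contain var."""
    if isinstance(expr, tuple):
        return all(free(arg, var) for arg in expr[1:])
    return expr != var

def match(pat, expr, env=None):
    """Return a dictionary of variable values, or None if the match fails."""
    env = {} if env is None else env
    if isinstance(pat, tuple) and pat[0] == "VAR":
        _, name, pred = pat
        return {**env, name: expr} if pred(expr) else None
    if isinstance(pat, tuple) and pat[0] in IDENTITY:
        return match_commutative(pat, expr, env)
    if isinstance(pat, tuple):
        # Any other operator: same operator, same arity, positional match.
        if not (isinstance(expr, tuple) and expr[0] == pat[0]
                and len(expr) == len(pat)):
            return None
        for p, e in zip(pat[1:], expr[1:]):
            env = match(p, e, env)
            if env is None:
                return None
        return env
    return env if pat == expr else None

def match_commutative(pat, expr, env):
    op, *subpats, rest = pat          # rest plays the role of q
    unit = IDENTITY[op]
    if op == "TIMES" and expr == 0:   # assumed rule: a product can be 0
        return match(rest, 0, env)
    # An expression not headed by op is treated as (op expr).
    terms = list(expr[1:]) if isinstance(expr, tuple) and expr[0] == op else [expr]
    for p in subpats:
        for i, term in enumerate(terms):
            found = match(p, term, env)
            if found is not None:
                env = found
                del terms[i]
                break
        else:
            # No subexpression matched: try the identity element instead.
            env = match(p, unit, env)
            if env is None:
                return None
    if not terms:
        leftover = unit
    elif len(terms) == 1:
        leftover = terms[0]
    else:
        leftover = (op, *terms)
    return match(rest, leftover, env)

# A linear pattern in x: coefficient B, constant term A.
LINEAR = ("PLUS", ("TIMES", "x", ("VAR", "B", free)), ("VAR", "A", free))
```

With these definitions match(LINEAR, 3) binds A to 3 and B to
0, and match(LINEAR, "x") binds A to 0 and B to 1. Collecting
a coefficient spread over several summands needs the TIMES*
loop, which this sketch does not attempt.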


Suppose we wish to match for a linear expression in x.
The following pattern may be used:

(PLUS (TIMES* x (VAR B FREE)) (VAR A FREE))

Here FREE is the name of a routine which checks to see if
its argument contains an x. This pattern when matched with
the expressions below at the left will yield the results at
the right.

3                            ((A . 3) (B . 0))
x                            ((A . 0) (B . 1))
(PLUS x PI 2 (TIMES y x))    ((A . (PLUS PI 2)) (B . (PLUS 1 y)))

There is another facility in SCHATCHEN which allows the
pattern to specify a loop over the expression. Suppose we
were interested in performing trigonometric simplification
such as

     a sin^2(b) + a cos^2(b) + e = a + e

The following pattern will perform the matching
necessary for the application of this rule. TRUE is a
function which will accept any input.

(PLUS (LOOP (TIMES (EXPT (SIN (VAR B TRUE)) 2) (VAR A TRUE))
            (TIMES (EXPT (COS (VAR C SCHATCHEN B)) 2)
                   (VAR D SCHATCHEN A)))
      (VAR E TRUE))

In the pattern above a loop will take place over all
sin and cos expressions until one pair is found which
meets the criteria or until all are exhausted. We do not
necessarily recommend this approach to the simplification of
trigonometric expressions.


It would be useful to make some of the following
extensions to SCHATCHEN. Currently SCHATCHEN can report a
successful match without yielding a dictionary containing a
value for each variable in the pattern. This could be
changed, for example, by giving default values to the
variables. On a semantic level, one should consider the
VAR pattern to be only one of a number of possible modes of
definition of variables. The VAR mode corresponds to the BUV
mode of CONVERT. There should also be a mode corresponding
to the UAR mode of CONVERT. This would accept any value
which meets some criterion the first time the variable is
encountered, but each successive value of the variable must
match the first one. This is similar to the numerical
constituent of COMIT which must match a previous
constituent. From an algebraic standpoint SCHATCHEN is
limited in that it performs no division. For instance, it
cannot match a pattern which would correspond to ax to an
expression such as a power of x. Some sort of division
should be introduced, but its applicability should be
governed by the user through a new operator such as TIMES**.
It would also be [...] to maintain the mode declarations
such as VAR [...] pattern, either as separate arguments to
[...] Elinst or CONVERT, or as some global [...] Formula
Algol. Furthermore there should exist some facility for
performing the construction of an expression through the use
of the dictionary supplied by the match. This can range from
a simple construction routine to a very powerful one as in
CONVERT.


The simplification program - SCHVUOS

We have written yet another simplification program. It
is called SCHVUOS which is an acronym for SCHatchen's
Version of an Unassuming Operational Simplifier. It is
remarkable in that it is very short. The basic
simplification routines are only two pages of LISP code
long. It assumes no standard form of the expression. Hence
it is inefficient on very large expressions. However the
size of an expression occurring in integration is relatively
small, and hence it is quite fast for our purposes.

In a sum the summands are simplified. A summand is
first stripped of its constants and a match is made for a
constant multiple of this summand using the TIMES* facility
of SCHATCHEN. This is done until each summand has been
accounted for. In a product, after the simplification of
all the terms, the program collects exponents by using
SCHATCHEN and matching for a variable power of the terms in
the product. The usual transformations involving 0 and 1
with the operators PLUS, TIMES, and EXPT are all made.
Furthermore an exponential term whose base is a product is
expanded, and constants are multiplied through sums.
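
The treatment of sums can be sketched as follows. This is an
illustration only and not the SCHVUOS code: expressions are
nested Python tuples, the names split_constant and
simplify_sum are our own, and constant multiples of a summand
are found by structural equality rather than by a SCHATCHEN
match, so that products whose factors are in a different
order are not recognized as the same summand.

```python
def split_constant(term):
    """Strip the numeric constants from a term: return (constant, rest)."""
    if isinstance(term, (int, float)):
        return term, 1
    if isinstance(term, tuple) and term[0] == "TIMES":
        const, rest = 1, []
        for factor in term[1:]:
            if isinstance(factor, (int, float)):
                const *= factor
            else:
                rest.append(factor)
        if not rest:
            return const, 1
        return const, rest[0] if len(rest) == 1 else ("TIMES", *rest)
    return 1, term

def simplify_sum(summands):
    """Collect constant multiples of equal summands in a sum."""
    coeffs = {}                      # rest -> accumulated constant
    for term in summands:
        const, rest = split_constant(term)
        coeffs[rest] = coeffs.get(rest, 0) + const
    out = []
    for rest, const in coeffs.items():
        if const == 0:
            continue                 # x + (-1)x drops out
        if rest == 1:
            out.append(const)        # pure constant
        elif const == 1:
            out.append(rest)         # 1*x is written x
        else:
            out.append(("TIMES", const, rest))
    if not out:
        return 0                     # empty sum
    return out[0] if len(out) == 1 else ("PLUS", *out)
```

For instance, the summands x, 2x, 3, -y and y are collected
into the two terms 3x and 3.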

Functions such as log and sin have their own
simplification routines which perform non-controversial
simplifications such as sin(0)=0.

There is an indicator which tells SCHVUOS whether or
not to simplify subexpressions. It is usually on except
during differentiation where the differentiated expression
is built from the bottom up and is simplified at each level,
with no redundant simplifications being performed. This is
SCHVUOS' alternative to the AUTSIM bit of FORMAC and to
Martin's version thereof.



The following routine is not part of the current
algorithmic integration program.

An Integral TAble Look-Up - ITALU (read I-tell-you)

We had spent some time building an integral table
look-up before realizing that for integration it would quite
likely be useless since the program could, in all
probability, integrate anything in the table quickly. Since
this is not the case for differential equations, definite
integration, or summation of series, we would like to
present what we have done in ITALU.

The basic steps in ITALU are as follows: An expression
to be integrated is first hash-coded. The code (a floating
point number) is looked up in an array (binary search) and a
disk address for it is found. The disk is read and forms of
the integrands in it are matched with the given expression
(probably only one exists with the right code) until one is
found which matches the expression. The corresponding
integral is evaluated after substituting in it the results
of the match.
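
The look-up of a code in the array can be sketched with a
binary search. The names build_index and lookup, the use of
strings for disk addresses, and the tolerance used when
comparing floating point codes are all assumptions of this
sketch; the reading of the disk and the match of the forms
found there are not shown.

```python
import bisect

def build_index(entries):
    """entries: iterable of (code, disk_address) pairs.
    Returns two parallel lists sorted by code."""
    pairs = sorted(entries)
    return [code for code, _ in pairs], [addr for _, addr in pairs]

def lookup(codes, addresses, code, tol=1e-9):
    """Return the disk addresses whose code lies within tol of code."""
    i = bisect.bisect_left(codes, code - tol)   # binary search
    found = []
    while i < len(codes) and codes[i] <= code + tol:
        found.append(addresses[i])
        i += 1
    return found
```

Since the search is logarithmic in the size of the array, the
cost of this step grows only slowly with the number of
entries.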

For example, the hash-code for (EXPT E (PLUS PI (TIMES
2 x))) would be the same as for the form

(EXPT (VAR C FREE) (PLUS (TIMES* (VAR V VARP) (VAR B FREE))
                         (VAR A FREE)))

(where VARP tests to see whether its argument is equal to x)
and the integral found by ITALU would then be 1/2 (EXPT E
(PLUS PI (TIMES 2 x))). The trick of evaluating the
integral allows the integral to be a function and thus
allows iterated integrals to be easily obtained.

The hash-code is designed to check for the form of the
expression and to ignore constants with respect to the
variable of integration. Thus x is coded like a+bx, but
sin(2x) is coded differently from sin(x^2) or cos(2x). Coding
is done recursively. The code of a sum is the sum of the
codes of the summands, ignoring constants. The code of a
product is likewise the product of the code of the terms,
ignoring constants. Trigonometric and logarithmic functions
are coded by exponentiating the code of the arguments by a
constant which is different for each function. The code for
an exponential term is more involved. Since the exponents
-1, 1/2, -1/2, 2, -2 occur so frequently, these are made
special cases. All other constant exponents are given the
same code. The code for (EXPT a b) is the code of a raised to
the code of b.
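
A recursive code of this kind might be sketched as follows.
The numerical constants, the code given to a constant base or
to a wholly constant expression, and the names code and
is_constant are arbitrary choices made for the illustration;
they are not the values used in ITALU.

```python
import math

X_CODE = 1.7          # assumed code of the variable of integration
CONST_CODE = 1.3      # assumed code of a constant base or expression
FUNC_POWER = {"SIN": 1.1, "COS": 1.2, "LOG": 1.35}   # one constant per function
SPECIAL_EXPONENT = {-1: 0.9, 0.5: 0.8, -0.5: 0.7, 2: 0.6, -2: 0.55}
OTHER_EXPONENT = 0.5  # shared by all other constant exponents

def is_constant(expr, var="x"):
    """True when expr does not contain the variable of integration."""
    if isinstance(expr, tuple):
        return all(is_constant(arg, var) for arg in expr[1:])
    return expr != var

def code(expr, var="x"):
    """Floating point code of the form of expr, ignoring constants."""
    if is_constant(expr, var):
        return CONST_CODE
    if expr == var:
        return X_CODE
    op, *args = expr
    varying = [arg for arg in args if not is_constant(arg, var)]
    if op == "PLUS":      # sum of the codes of the non-constant summands
        return sum(code(arg, var) for arg in varying)
    if op == "TIMES":     # product of the codes of the non-constant factors
        return math.prod(code(arg, var) for arg in varying)
    if op == "EXPT":
        base, exponent = args
        if is_constant(exponent, var):
            exp_code = SPECIAL_EXPONENT.get(exponent, OTHER_EXPONENT)
        else:
            exp_code = code(exponent, var)
        return code(base, var) ** exp_code
    # sin, cos, log: code of the argument raised to the function's constant
    return code(args[0], var) ** FUNC_POWER[op]
```

Under these assumptions code("x") equals the code of
(PLUS a (TIMES b x)), while (SIN (TIMES 2 x)) and
(COS (TIMES 2 x)) receive different codes. Distinct forms may
still collide, which is why the forms found on the disk are
matched against the expression afterwards.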

We have also experimented with a hash-code which notes
that a constant occurred, but ignores the value of this
constant. Thus 1+x codes differently from x, but the same
as 2+x.

The advantage of the scheme above is that it is quite
fast (we believe that one can get running times of about a
second per integral, most of which is spent accessing the
disk). Furthermore the speed of the look-up is not
appreciably deteriorated with an increase in the number of
entries in the table. The scheme yields a powerful table
look-up because of the use of SCHATCHEN to perform the
match, which allows variables to appear in the integrand
just as they do in the standard integration tables. The use
of the device of evaluating the integral allows iterated
integrals to be conveniently entered in the table.

Integral table look-up schemes reported in the
literature are to be found in [...].